Fast computation of products by dyadic fractions with sign-symmetric rounding errors

ABSTRACT

A product of an integer value and an irrational value may be determined by a sign-symmetric algorithm. A process may determine possible algorithms that minimize metrics such as mean asymmetry, mean error, variance of error, and magnitude of error. Given an integer variable x and rational dyadic constants that approximate the irrational value, a series of intermediate values may be produced that are sign-symmetric. The intermediate values may be produced by a sequence of addition, subtraction and right shift operations that, when combined, approximate the product of the integer and the irrational value. Other operations, such as additions or subtractions of 0s or shifts by 0 bits, may be removed.

BACKGROUND

1. Field

The subject matter herein relates generally to processing, and more particularly to approximation techniques used in hardware and software processing.

2. Background

Arithmetic shifts may be used to perform multiplication or division of signed integers by powers of two. Shifting left by n bits on a signed or unsigned binary number has the effect of multiplying it by 2^(n). Shifting right by n bits on a two's complement signed binary number has the effect of dividing it by 2^(n), but it always rounds down (i.e., towards negative infinity). Because right shifts are not linear operations, arithmetic right shifts may add rounding errors and produce results that may not be equal to the result of multiplication followed by the right shift.
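The rounding behavior described above can be illustrated with a minimal C sketch. The helper name asr and the factor 3 are assumptions made for illustration only. Because the C language leaves a right shift of a negative signed value implementation-defined, the helper routes negative inputs through a non-negative value so that the result always rounds toward negative infinity.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v.
   For negative v the shift is applied to the non-negative value ~v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* Shifting first and multiplying afterwards loses the shifted-out bit. */
static int32_t shift_then_multiply(int32_t x) { return 3 * asr(x, 1); }

/* Multiplying first keeps that bit until the final shift. */
static int32_t multiply_then_shift(int32_t x) { return asr(3 * x, 1); }
```

For x = 3 the two orderings differ (3 versus 4), and asr(-7, 1) yields -4 whereas the C expression -7 / 2 truncates to -3.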

In some implementations, the sign-symmetric algorithm may be used in an IDCT transform architecture or other digital filter.

One example of the use of arithmetic shifts is in fixed-point implementations of some signal-processing algorithms, such as FFT, DCT, MLT, MDCT, etc. Such signal-processing algorithms typically approximate irrational (algebraic or transcendental) factors in mathematical definitions of these algorithms using dyadic rational fractions. This allows multiplications by these irrational factors to be performed using integer additions and shifts, rather than more complex operations.

SUMMARY

A product of an integer value and an irrational value may be determined by a sign-symmetric algorithm. A process may determine possible algorithms that minimize metrics such as mean asymmetry, mean error, variance of error, and magnitude of error. Given an integer variable x and rational dyadic constants that approximate the irrational value, a series of intermediate values may be produced that are sign-symmetric. Given a sequence of addition, subtraction and right shift operations, the sign-symmetric algorithm may approximate the product of the integer and the irrational value. Other operations, such as additions or subtractions of 0s or shifts by 0 bits, may be removed to simplify the processing.

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a plot of results of various computational algorithms.

FIG. 2 is a flow diagram of an example process of determining a sign-symmetric algorithm to determine a product.

FIG. 3 is an exemplary architecture implementing a fixed-point IDCT algorithm.

FIG. 4 is a block diagram of an exemplary encoding system.

FIG. 5 is a block diagram of an exemplary decoding system.

DETAILED DESCRIPTION

Discrete Cosine Transforms (DCT) and Inverse Discrete Cosine Transforms (IDCT) perform multiplication operations with irrational constants (i.e., the cosines). In the design of implementations of the DCT/IDCT, products by these irrational constants may be computed approximately using fixed-point arithmetic. One technique for converting floating-point to fixed-point values is based on the approximation of irrational factors α_(i) by dyadic fractions:

α_(i) ≈ a_(i)/2^(k)  (1)

where both a_(i) and k are integers. Multiplication of x by factor α_(i) provides for an implementation of an approximation in integer arithmetic, as follows:

x*α_(i) ≈ (x*a_(i)) >> k  (2)

where >> denotes the bit-wise right shift operation.
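Relationship (2) can be sketched in C as follows. The function name mul_dyadic and the helper asr are hypothetical, and the sketch assumes the product x*a fits in 32 bits. The shift rounds toward negative infinity, so positive and negative inputs of equal magnitude are not rounded symmetrically.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* Approximates x * alpha by (x * a) >> k, where alpha ~ a / 2^k.
   Assumes the product x * a does not overflow int32_t. */
static int32_t mul_dyadic(int32_t x, int32_t a, int k) {
    return asr(x * a, k);
}
```

With a = 181 and k = 8 (an approximation of 1/√2), mul_dyadic(1000, 181, 8) gives 707 while mul_dyadic(-1000, 181, 8) gives -708, which illustrates the asymmetric rounding that motivates the sign-symmetric algorithms discussed later.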

The number of precision bits, k, may affect the complexity of dyadic rational approximations. In software implementations, the precision parameter k may be constrained by the width of registers (e.g., 16 or 32 bits), and failing to satisfy such a design constraint may extend the execution time of the transform. In hardware designs, the precision parameter k affects the number of gates needed to implement adders and multipliers. Hence, a goal in fixed-point designs is to minimize the total number of bits k while maintaining sufficient accuracy of the approximations.

Absent any specific constraints on the values of α_(i), for any given k the corresponding numerators a_(i) may be chosen such that:

$|\alpha_i - a_i/2^k| = 2^{-k}\,|2^k\alpha_i - a_i| = 2^{-k}\min\limits_{z \in \mathbb{Z}}|2^k\alpha_i - z|$

and the absolute error of the approximations in (1) should be inversely proportional to 2^(k):

|α_(i) − a_(i)/2^(k)| ≤ 2^(−k−1)

That is, each extra bit of precision (i.e., incrementing k) should reduce the error by half.
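As a hedged illustration of the numerator choice and the error bound, the following C sketch picks the nearest integer numerator for a non-negative factor. The function names nearest_numerator and approx_error are assumptions introduced here, not part of the described method.

```c
#include <assert.h>
#include <stdint.h>

/* Nearest integer numerator a such that alpha ~ a / 2^k, for alpha >= 0.
   Assumes alpha * 2^k is small enough to fit in int32_t. */
static int32_t nearest_numerator(double alpha, int k) {
    assert(alpha >= 0.0 && k >= 0 && k < 30);
    return (int32_t)(alpha * (double)(1 << k) + 0.5);
}

/* Absolute approximation error |alpha - a / 2^k|. */
static double approx_error(double alpha, int32_t a, int k) {
    double e = alpha - (double)a / (double)(1 << k);
    return e < 0.0 ? -e : e;
}
```

For the factor 1/√2 ≈ 0.7071068, this yields the numerator 23 for k = 5 and 181 for k = 8, and in both cases the error stays within 2^(−k−1).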

In some implementations, the error rate may be improved if the values α₁, . . . , α_(n) to be approximated can be scaled by some additional parameter ξ. If α₁, . . . , α_(n) are a set of n irrational numbers (n ≥ 2), then there exist infinitely many (n+2)-tuples a₁, . . . , a_(n), k, ξ, with a₁, . . . , a_(n) ∈ Z, k ∈ N, and ξ ∈ Q, such that:

$\max\left\{|\xi\alpha_1 - a_1/2^k|, \ldots, |\xi\alpha_n - a_n/2^k|\right\} < \frac{n}{n+1}\,\xi^{-1/n}\,2^{-k(1+1/n)}$

In other words, if the algorithm can be altered such that all of its irrational factors α₁, . . . , α_(n) can be pre-scaled by some parameter ξ, then there should be approximations having absolute error that decreases as fast as 2^(−k(1+1/n)). For example, when n=2, there may be approximately 50% higher effectiveness in the usage of bits. For large sets of factors α₁, . . . , α_(n), however, this gain may be smaller.
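As a worked illustration of the exponent only (the choice n = 8 is an arbitrary example and the constant factor in front of the power of two is ignored), the rate of decrease may be compared for a small and a larger set of factors:

```latex
2^{-k(1+1/n)}\Big|_{n=2} = 2^{-1.5\,k},
\qquad
2^{-k(1+1/n)}\Big|_{n=8} = 2^{-1.125\,k}
```

With n = 2 each bit of k acts like 1.5 bits of precision, which corresponds to the 50% figure above, whereas with n = 8 the gain shrinks to 12.5%.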

The dyadic approximations shown in relationships (1, 2) above reduce the problem of computing products by irrational constants to multiplications by integers. Multiplication of an integer by the irrational factor 1/√2, using its 5-bit dyadic approximation 23/32, illustrates the process of approximating irrational constants. By looking at the binary bit pattern of 23=10111 and substituting each “1” with an addition operation, a product of an integer multiplied by 23 may be determined as follows:

x*23=(x<<4)+(x<<2)+(x<<1)+x

This approximation requires 3 addition and 3 shift operations. By further noting that the last 3 digits form a run of “1”s, the following may be used:

x*23=(x<<4)+(x<<3)−x,

which reduces the complexity to just 2 shift and 2 addition operations.
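The two decompositions of the multiplication by 23 can be written as C functions for comparison. The function names are hypothetical, and the sketch is restricted to non-negative inputs because a left shift of a negative signed value is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Binary expansion of 23 = 10111b: three shifts and three additions. */
static int32_t mul23_binary(int32_t x) {
    assert(x >= 0 && x < (1 << 26));   /* keeps 23 * x within int32_t */
    return (x << 4) + (x << 2) + (x << 1) + x;
}

/* Run of ones rewritten as 11000b - 1: two shifts, one addition and
   one subtraction. */
static int32_t mul23_run(int32_t x) {
    assert(x >= 0 && x < (1 << 26));
    return (x << 4) + (x << 3) - x;
}
```

Both functions return 23*x for every input in the allowed range; only the operation count differs.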

The sequences of operations “+” associated with isolated digits “1”, or “+” and “−” associated with beginnings and ends of runs “1 . . . 1” are commonly referred to as a “Canonical Signed Digit” (CSD) decomposition. CSD is a well-known technique in the design of multiplier-less circuits. However, CSD decompositions do not always produce results with the lowest numbers of operations. For example, consider an 8-bit approximation of the same factor 1/√2 ≈ 181/256 = 10110101 and its CSD decomposition:

x*181=(x<<7)+(x<<5)+(x<<4)+(x<<2)+x

which uses 4 addition and 4 shift operations. By rearranging the computations and reusing intermediate results, a more efficient algorithm with 3 addition and 3 shift operations may be constructed:

x2 = x + (x << 2);   // 101 = x*5

x3 = x2 + (x << 4);  // 10101 = x*21

x4 = x3 + (x2 << 5); // 10110101 = x*181

In accordance with an implementation, the computation of products by dyadic fractions may be derived by allowing the use of right shifts as elementary operations. For example, considering the factor 1/√2 ≈ 23/32 = 10111, and using right shift and addition operations according to its CSD decomposition, the following is obtained:

x*23/32˜(x>>1)+(x>>2)−(x>>5).  (3)

or by further noting that ½+¼=1−¼:

x*23/32˜x−(x>>2)−(x>>5).  (4)

Yet another way of computing the product by the same factor is:

x*23/32˜x−((x+(x>>4))>>2)+((−x)>>6).  (5)

FIG. 1 illustrates the plots of values produced by algorithms (3, 4, and 5) versus multiplication of an integer by the dyadic fraction 23/32. Each algorithm (3, 4, and 5) computes values that approximate products by the fraction 23/32; however, the errors in each of these approximations are different. For example, the algorithm (4) produces all positive errors, with a maximum magnitude of 55/32. The algorithm (3) has more balanced errors, with the magnitude of oscillations within ±65/64. Finally, the algorithm (5) produces perfectly sign-symmetric errors with oscillations in ±7/8. Thus, a sign-symmetric algorithm will produce balanced results that minimize errors.
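The three algorithms can be compared with a small C sketch. The names asr, alg3, alg4, alg5 and err32 are assumptions made for illustration, and the right shift is modeled as rounding toward negative infinity. The error is scaled by 32 so that it remains an exact integer.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* Algorithms (3), (4) and (5) for the product x * 23/32. */
static int32_t alg3(int32_t x) { return asr(x, 1) + asr(x, 2) - asr(x, 5); }
static int32_t alg4(int32_t x) { return x - asr(x, 2) - asr(x, 5); }
static int32_t alg5(int32_t x) { return x - asr(x + asr(x, 4), 2) + asr(-x, 6); }

/* Rounding error scaled by 32: 32 * (A(x) - x * 23/32). */
static int32_t err32(int32_t (*alg)(int32_t), int32_t x) {
    return 32 * alg(x) - 23 * x;
}
```

Under these assumptions, err32(alg4, x) is never negative and reaches 55 at x = 31 (an error of 55/32), while alg5(-x) equals -alg5(x) for every x, so its errors are mirrored around zero.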

The sign-symmetry property of an algorithm A_(a_i,b)(x) ≈ x*a_(i)/2^(b) means that for any x ∈ Z:

A_(a_i,b)(−x) = −A_(a_i,b)(x)

and it also implies that for any N, and assuming that A_(a_i,b)(0)=0:

$\sum\limits_{x=-N}^{N}\left[A_{a_i,b}(x) - x\,\frac{a_i}{2^b}\right] = 0$

that is, a zero-mean error on any symmetric interval.
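The zero-sum property follows by pairing each x with −x. A short derivation, using only the sign-symmetry condition and A(0) = 0 stated above, may be written as:

```latex
\sum_{x=-N}^{N}\left[A_{a_i,b}(x) - x\,\frac{a_i}{2^b}\right]
= A_{a_i,b}(0)
+ \sum_{x=1}^{N}\left(\left[A_{a_i,b}(x) - x\,\frac{a_i}{2^b}\right]
+ \left[A_{a_i,b}(-x) + x\,\frac{a_i}{2^b}\right]\right)
= \sum_{x=1}^{N}\left[A_{a_i,b}(x) + A_{a_i,b}(-x)\right] = 0
```

The exact products cancel pairwise, and each remaining pair A(x) + A(−x) vanishes by sign symmetry.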

This property may be used in the design of signal processing algorithms, as it minimizes the probability that rounding errors introduced by fixed-point approximations will accumulate. Described below is the basis for right-shift-based sign-symmetric algorithms for computing products by dyadic fractions, as well as upper bounds for their complexity.

Given a set of dyadic fractions a₁/2^(b), . . . , a_(m)/2^(b), an algorithm can be defined:

A_(a₁, . . . , a_m, b)(x) ≈ (x*a₁/2^(b), . . . , x*a_(m)/2^(b))

as the following sequence of steps:

x₁, x₂, . . . , x_(t),

where x₁:=x, and where subsequent values x_(k) (k=2, . . . , t) are produced by using one of the following elementary operations:

$x_k := \begin{cases} x_i \gg s_k; & 1 \le i < k,\ s_k \ge 1; \\ -x_i; & 1 \le i < k; \\ x_i + x_j; & 1 \le i, j < k; \\ x_i - x_j; & 1 \le i, j < k,\ i \ne j. \end{cases}$

The algorithm terminates when there exist indices j₁, . . . , j_(m) ≤ t, such that:

x_(j1)˜x*a₁/2^(b), . . . , x_(jm)˜x*a_(m)/2^(b)
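The sequence of elementary operations can be modeled as a small straight-line program. The following C sketch is one possible encoding; the type names op_kind and step, the function run, and the table ALG5 are all hypothetical. The table expresses algorithm (5) for the factor 23/32 using only the four elementary operations.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* One elementary operation. i is the 1-based index of an earlier value;
   j_or_s is a second index (add, subtract) or a shift count (shift). */
typedef enum { OP_SHR, OP_NEG, OP_ADD, OP_SUB } op_kind;
typedef struct { op_kind kind; int i; int j_or_s; } step;

/* Evaluates x_1 = x, x_2, ..., x_t and returns the last value x_t. */
static int32_t run(const step *prog, int n_steps, int32_t x) {
    int32_t v[32];
    assert(n_steps >= 0 && n_steps < 31);
    v[1] = x;
    for (int k = 2; k <= n_steps + 1; ++k) {
        const step *p = &prog[k - 2];
        assert(p->i >= 1 && p->i < k);
        switch (p->kind) {
        case OP_SHR: v[k] = asr(v[p->i], p->j_or_s); break;
        case OP_NEG: v[k] = -v[p->i]; break;
        case OP_ADD: v[k] = v[p->i] + v[p->j_or_s]; break;
        case OP_SUB: v[k] = v[p->i] - v[p->j_or_s]; break;
        }
    }
    return v[n_steps + 1];
}

/* Algorithm (5): x - ((x + (x >> 4)) >> 2) + ((-x) >> 6). */
static const step ALG5[] = {
    { OP_SHR, 1, 4 },   /* x2 = x1 >> 4 */
    { OP_ADD, 1, 2 },   /* x3 = x1 + x2 */
    { OP_SHR, 3, 2 },   /* x4 = x3 >> 2 */
    { OP_SUB, 1, 4 },   /* x5 = x1 - x4 */
    { OP_NEG, 1, 0 },   /* x6 = -x1     */
    { OP_SHR, 6, 6 },   /* x7 = x6 >> 6 */
    { OP_ADD, 5, 7 },   /* x8 = x5 + x7 */
};
```

Calling run(ALG5, 7, 64) evaluates the eight intermediate values and returns 46, which equals 64*23/32.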

Thus, some implementations examine algorithms that minimize one or more of the following metrics:

mean asymmetry:

$\chi_{A_{a_i,b}} = \frac{1}{2^b}\sum\limits_{x=1}^{2^b}\left|A_{a_i,b}(x) + A_{a_i,b}(-x)\right|$

mean error:

$\mu_{A_{a_{i},b}} = {\frac{1}{2^{b + 1}}{\sum\limits_{x = {- 2^{b}}}^{2^{b}}\left\lbrack {{A_{a_{i},b}(x)} - {x\; \frac{a_{i}}{2^{b}}}} \right\rbrack}}$

variance of error:

$\sigma_{A_{a_{i},b}}^{2} = {\frac{1}{2^{b + 1} - 1}{\sum\limits_{x = {- 2^{b}}}^{2^{b}}\left\lbrack {{A_{a_{i},b}(x)} - \mu_{A_{a_{i},b}}} \right\rbrack^{2}}}$

magnitude of error:

$\delta_{A_{a_i,b}}^{\max} = \max\limits_{x = -2^b \ldots 2^b}\left|A_{a_i,b}(x) - \mu_{A_{a_i,b}}\right|$

When computing products by multiple constants a₁/2^(b), . . . , a_(m)/2^(b), the worst case values of the above metrics (computed for each of the constants a₁/2^(b), . . . , a_(m)/2^(b)) may be used to evaluate the efficiency of the algorithm.
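Two of the metrics can be evaluated by exhaustive enumeration, as in the following C sketch. The names alg_fn, mean_asymmetry and mean_error are assumptions, the example algorithms are (4) and (5) for the factor 23/32, and the normalization follows the formulas as written above. Variance and magnitude of error could be accumulated in the same loop and are omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

static int32_t alg4(int32_t x) { return x - asr(x, 2) - asr(x, 5); }
static int32_t alg5(int32_t x) { return x - asr(x + asr(x, 4), 2) + asr(-x, 6); }

typedef int32_t (*alg_fn)(int32_t);

/* Mean asymmetry: (1 / 2^b) * sum over x = 1..2^b of |A(x) + A(-x)|. */
static double mean_asymmetry(alg_fn alg, int b) {
    assert(b >= 0 && b <= 20);
    int32_t n = (int32_t)1 << b;
    int64_t sum = 0;
    for (int32_t x = 1; x <= n; ++x) {
        int32_t s = alg(x) + alg(-x);
        sum += s < 0 ? -s : s;
    }
    return (double)sum / (double)n;
}

/* Mean error: (1 / 2^(b+1)) * sum over x = -2^b..2^b of
   [A(x) - x * a / 2^b]. The sum is kept as an exact integer scaled by 2^b. */
static double mean_error(alg_fn alg, int32_t a, int b) {
    assert(b >= 0 && b <= 20);
    int32_t n = (int32_t)1 << b;
    int64_t sum_scaled = 0;
    for (int32_t x = -n; x <= n; ++x)
        sum_scaled += (int64_t)n * alg(x) - (int64_t)a * x;
    return (double)sum_scaled / ((double)n * (double)(2 * n));
}
```

For the sign-symmetric algorithm (5), both metrics evaluate to zero with a = 23 and b = 5, whereas algorithm (4) gives strictly positive values for both.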

FIG. 2 shows stages in a process 100 for computing a product. At 102, an integer value is received and, at 104, rational dyadic constants representative of the irrational value to be multiplied by the integer are ascertained. At 106, intermediate values may be determined. For example, given the integer variable x and a set of rational dyadic constants a₁/2^(b), . . . , a_(m)/2^(b), a series of intermediate values may be determined as follows:

w₀, w₁, w₂, . . . , w_(t)

where: w₀=0, w₁=x, and for all k ≥ 2 values w_(k) are obtained as follows:

w_(k) = ±w_(i) ± (w_(j) >> s_(k))  (i, j < k),

where the ± signs denote either a plus or a minus operation to be applied to each of the two terms, and >> denotes the right shift of variable w_(j) by s_(k) bits.

At 108, the points in this series corresponding to products are determined. That is, the result of this step is a set of indices l₁, . . . , l_(m) ≤ t, such that:

$w_{l_1} \approx x\,\frac{a_1}{2^b}, \ldots, w_{l_m} \approx x\,\frac{a_m}{2^b}$

At 110, the resulting output values

$w_{l_1} \approx x\,\frac{a_1}{2^b}, \ldots, w_{l_m} \approx x\,\frac{a_m}{2^b}$

are analyzed against certain precision metrics. For example, these values can be analyzed to determine if they minimize one of mean asymmetry, mean error, variance of error, or magnitude of error.

In some implementations, the process 100 may remove additions or subtractions of 0s, or shifts by 0 bits. In some implementations, the sequence of intermediate values may be chosen such that the total computational (or implementation) cost of this entire operation is minimal.

Thus, given a set of metrics, there may be algorithms having a complexity that can be characterized by a total number of additions, a total number of shifts, etc. As such, there is an algorithm that has a least number of additions, a least number of shifts, a least number of additions and shifts, and a least number of additions among algorithms attaining the least number of additions and shifts, etc.

FIG. 3 illustrates an exemplary fixed-point 8×8 IDCT architecture 120. The architecture may implement algorithms that are sign-symmetric with respect to the values of x. In many implementations, such algorithms may be the least complex for a given set of factors. As noted above, the design of the IDCTs may be symmetric or result in well-balanced rounding errors. In some implementations, estimates of metrics such as mean, variance, and magnitude (maximum values) of errors produced by the algorithms may be analyzed. In assessing the complexity of the algorithms, the numbers of operations, as well as the longest execution path and the maximum number of intermediate registers needed for computations, may be taken into account.

In some implementations, the architecture used in the design of the proposed fixed-point IDCT architecture 120 may be characterized as having separable and scaled features. A scaling stage 122 may include a single 8×8 matrix that is precomputed by factoring the 1D scale factors for the row transform with the 1D scale factors for the column transform. The scaling stage 122 may also be used to pre-allocate P bits of precision to each of the input DCT coefficients, thereby providing a fixed-point “mantissa” for use throughout the rest of the transform.

In an implementation, the basis for scaled 1D transform design may be a variant of the well-known factorization of C. Loeffler, A. Ligtenberg, and G. S. Moschytz (LLM) with 3 planar rotations and 2 independent factors γ=√2. To provide for efficient rational approximations of constants α, β, δ, ε, η, and θ within the LLM factorization, two floating factors ξ and ζ may be used and applied to two sub-groups of these constants as follows:

ξ: α′=ξα, β′=ξβ;

ζ: δ′=ζδ, ε′=ζε, η′=ζη, θ′=ζθ;

The multiplications by ξ and ζ may be inverted in the scaling stage 122 by multiplying each input DCT coefficient by the respective reciprocal of ξ or ζ. That is, a vector of scale factors may be computed for use in the scaling stage 122 prior to the first in the cascade of 1D transforms (e.g., stages 126 and 128):

σ=(1,1/ζ,1/ξ,γ/ζ,1,γ/ζ,1/ξ,1/ζ)^(T)

These factors may be subsequently merged into a scaling matrix which is precomputed as follows:

$\Sigma = {{{\sigma\sigma}^{T}2^{S}} = \begin{pmatrix}A & B & C & D & A & D & C & B \\B & E & F & G & B & G & F & E \\C & F & H & I & C & I & H & F \\D & G & I & J & D & J & I & G \\A & B & C & D & A & D & C & B \\D & G & I & J & D & J & I & G \\C & F & H & I & C & I & H & F \\B & E & F & G & B & G & F & E\end{pmatrix}}$

where A-J denote unique values in this product:

$\begin{matrix}{{A = 2^{S}},{B = \frac{2^{S}}{\zeta}},{C = \frac{2^{S}}{\xi}},{D = \frac{{\gamma 2}^{S}}{\zeta}},{E = \frac{2^{S}}{\zeta^{2}}},} \\{{F = \frac{2^{S}}{\xi\zeta}},{G = \frac{{\gamma 2}^{S}}{\zeta^{2}}},{H = \frac{2^{S}}{\xi^{2}}},{I = \frac{{\gamma 2}^{S}}{\xi\zeta}},{J = \frac{\gamma^{2}2^{S}}{\zeta^{2}}},}\end{matrix}$

and S denotes the number of fixed-point precision bits allocated for scaling.

This parameter S may be chosen such that it is greater than or equal to the number of bits P for the mantissa of each input coefficient. This allows scaling of the coefficients F_(vu) to be implemented as follows:

F′ _(vu)=(F _(vu) *S _(vu))>>(S−P)

where S_(vu) ≈ Σ_(vu) denote integer approximations of the values in the matrix of scale factors.
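The scaling step can be sketched in C as shown below. The function name scale_coeff, the helper asr64, and the values S = 10 and P = 7 in the usage note are illustrative assumptions and are not taken from any particular design. A 64-bit intermediate is used so that the product does not overflow.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit right shift that rounds toward negative infinity. */
static int64_t asr64(int64_t v, int s) {
    assert(s >= 0 && s < 63);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* F'_vu = (F_vu * S_vu) >> (S - P), where S_vu approximates an entry of
   the scaling matrix and S >= P. The result carries P mantissa bits. */
static int32_t scale_coeff(int32_t f, int32_t s_vu, int s_bits, int p_bits) {
    assert(p_bits >= 0 && s_bits >= p_bits);
    return (int32_t)asr64((int64_t)f * s_vu, s_bits - p_bits);
}
```

For the entry A = 2^(S) of the matrix, the scaling reduces to a left shift by P bits: with the assumed S = 10 and P = 7, scale_coeff(3, 1 << 10, 10, 7) returns 384, which is 3 << 7.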

At the end of the last transform stage in the series of 1D transforms (stages 126 and 128), the P fixed-point mantissa bits (plus 3 extra bits accumulated during executions of each of the 1D stages) are simply shifted out of the transform outputs by right shift operations 130, as follows:

f _(yx) =f′ _(yx)>>(P+3)

To ensure a proper rounding of the computed value, a bias of 2^(P+2) may be added to the values f′_(yx) prior to the shifts using a DC bias stage 124. This rounding bias is implemented by perturbing the DC coefficient prior to executing the first 1D transform:

F″ ₀₀ =F′ ₀₀+2^(P+2)
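The effect of the bias on the final shift can be illustrated with the following C sketch. It applies the bias directly to an output value rather than to the DC coefficient, which is a simplification made here only to show the rounding behavior; the function names and the value P = 10 in the usage note are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

/* Final shift without bias: f = f' >> (P + 3). */
static int32_t final_shift(int32_t f_prime, int p_bits) {
    return asr(f_prime, p_bits + 3);
}

/* Final shift after adding 2^(P+2), which is half of 2^(P+3), so the
   result is rounded to the nearest integer instead of rounded down. */
static int32_t final_shift_biased(int32_t f_prime, int p_bits) {
    return asr(f_prime + ((int32_t)1 << (p_bits + 2)), p_bits + 3);
}
```

With P = 10, the value 12288 represents 1.5 in units of 2^(13): final_shift returns 1, while final_shift_biased returns 2.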

In some implementations, balanced (i.e., sign-symmetric) algorithms as discussed above may be used in the ISO/IEC 23002-2 IDCT standard. This standard defines a process for computation of products by the following constants:

y ≈ x*719/4096,

z ≈ x*113/128

and is accomplished as follows:

x2=(x>>3)−(x>>7);

x3=x2−(x>>11);

y=x2+(x3>>1);

z=x−x2;
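The four statements above can be wrapped in a C function to check which constant each output tracks. The names products and iso_products are hypothetical, and the right shift is modeled as rounding toward negative infinity. With the statements as listed, x2 carries x*15/128, so y tracks x*719/4096 and z tracks x*113/128.

```c
#include <assert.h>
#include <stdint.h>

/* Right shift that rounds toward negative infinity for any sign of v. */
static int32_t asr(int32_t v, int s) {
    assert(s >= 0 && s < 31);
    return v < 0 ? ~(~v >> s) : v >> s;
}

typedef struct { int32_t y; int32_t z; } products;

/* Computes both products with 3 shifts of x, 1 shift of x3,
   and 4 additions or subtractions. */
static products iso_products(int32_t x) {
    int32_t x2 = asr(x, 3) - asr(x, 7);   /* x * 15/128   */
    int32_t x3 = x2 - asr(x, 11);         /* x * 239/2048 */
    products p;
    p.y = x2 + asr(x3, 1);                /* x * 719/4096 */
    p.z = x - x2;                         /* x * 113/128  */
    return p;
}
```

For x = 4096, where every shift is exact, the function returns y = 719 and z = 3616 (that is, 4096*113/128).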

FIG. 4 shows a block diagram of an encoding system 400, which may include transforms implementing the dyadic fractions having sign-symmetric rounding errors, as described above. A capture device/memory 410 may receive a source signal, perform conversion to digital format, and provide input/raw data. Capture device 410 may be a video camera, a digitizer, or some other device. A processor 420 processes the raw data and generates compressed data. Within processor 420, the raw data may be transformed by a DCT unit 422, scanned by a zig-zag scan unit 424, quantized by a quantizer 426, encoded by an entropy encoder 428, and packetized by a packetizer 430. DCT unit 422 may perform 2D DCTs on the raw data in accordance with the techniques described herein and may support both full and scaled interfaces. Each of units 422 through 430 may be implemented in hardware, firmware and/or software. For example, DCT unit 422 may be implemented with dedicated hardware, a set of instructions for an arithmetic logic unit (ALU), etc.

A storage unit 440 may store the compressed data from processor 420. A transmitter 442 may transmit the compressed data. A controller/processor 450 controls the operation of various units in encoding system 400. A memory 452 stores data and program codes for encoding system 400. One or more buses 460 interconnect various units in encoding system 400.

FIG. 5 shows a block diagram of a decoding system 500, which may include transforms implementing the dyadic fractions having sign-symmetric rounding errors, as described above. A receiver 510 may receive compressed data from an encoding system, and a storage unit 512 may store the received compressed data. A processor 520 processes the compressed data and generates output data. Within processor 520, the compressed data may be de-packetized by a de-packetizer 522, decoded by an entropy decoder 524, inverse quantized by an inverse quantizer 526, placed in the proper order by an inverse zig-zag scan unit 528, and transformed by an IDCT unit 530. IDCT unit 530 may perform 2D IDCTs on the full or scaled transform coefficients in accordance with the techniques described herein and may support both full and scaled interfaces. Each of units 522 through 530 may be implemented in hardware, firmware and/or software. For example, IDCT unit 530 may be implemented with dedicated hardware, a set of instructions for an ALU, etc.

A display unit 540 displays reconstructed images and video from processor 520. A controller/processor 550 controls the operation of various units in decoding system 500. A memory 552 stores data and program codes for decoding system 500. One or more buses 560 interconnect various units in decoding system 500.

Processors 420 and 520 may each be implemented with one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), and/or some other type of processors. Alternatively, processors 420 and 520 may each be replaced with one or more random access memories (RAMs), read only memories (ROMs), electrically programmable ROMs (EPROMs), electrically erasable programmable ROMs (EEPROMs), magnetic disks, optical disks, and/or other types of volatile and nonvolatile memories known in the art.

The embodiments described herein may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When the systems and/or methods are implemented in software, firmware, middleware or microcode, program code or code segments, they may be stored in a machine-readable medium, such as a storage component. A code segment may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted using any suitable means including memory sharing, message passing, token passing, network transmission, etc.

For a software implementation, the techniques described herein may be implemented with modules (e.g., procedures, functions, and so on) that perform the functions described herein. The software codes may be stored in memory units and executed by processors. The memory unit may be implemented within the processor or external to the processor, in which case it can be communicatively coupled to the processor through various means as is known in the art.

The stages of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in random access memory (“RAM”), flash memory, read-only memory (“ROM”), erasable programmable read-only memory (“EPROM”), electrically-erasable programmable read-only memory (“EEPROM”), registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An example storage medium is coupled to the processor, such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an application-specific integrated circuit (“ASIC”). The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.

It should be noted that the methods described herein may be implemented on a variety of hardware, processors and systems known by one of ordinary skill in the art. For example, a machine that is used in an implementation may have a display to display content and information, a processor to control the operation of the machine, and a memory for storing data and programs related to the operation of the machine. In some implementations, the machine is a cellular phone. In some implementations, the machine is a handheld computer or handset having communications capabilities. In another implementation, the machine is a personal computer having communications capabilities.

The various illustrative logics, logical blocks, modules, and circuits described in connection with the implementations disclosed herein may be implemented or performed with a general-purpose processor, a DSP, an ASIC, a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform functions described herein. A general-purpose processor may be a microprocessor, but, in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

1. A method for calculating products, comprising: receiving an integer value x; determining a set of dyadic fractions a₁/2^(b), . . . , a_(m)/2^(b) approximating given constant factors; determining a sequence of intermediate values, x₁ . . . x_(t), to calculate the products by: setting x₁ equal to the input integer value x; and determining x₂ . . . x_(t) in accordance with (a) at least one of x₁, . . . x_(t−1), and (b) one of a plus operation, a minus operation, or a right-shift operation; and determining indices of output values l₁, . . . , l_(m) ≤ t, such that: x_(l₁) ≈ x*a₁/2^(b), . . . , x_(l_m) ≈ x*a_(m)/2^(b).
2. The method of claim 1, further comprising: determining the sequence producing output values in accordance with a mean asymmetry metric, a mean error metric, a variance of error metric, and a magnitude of error metric.
3. The method of claim 2, further comprising: evaluating an efficiency of the sequence of output values based on a worst case result of the mean asymmetry metric, the mean error metric, the variance of error metric, and the magnitude of error metric.
4. The method of claim 1, further comprising: determining the sequence of intermediate values having a least number of additions.
5. The method of claim 1, further comprising: determining the sequence of intermediate values having a least number of right shifts.
6. The method of claim 1, further comprising: determining the sequence of intermediate values having the least number of additions and right shifts.
7. The method of claim 6, further comprising: determining the sequence of intermediate values having a least number of additions among the sequence of intermediate values having the least number of additions and right shifts.
8. The method of claim 1, determining x₂ . . . x_(t) further comprising: defining a member x_(k) of the intermediate values as having one of the values x_(i)>>s_(k), −x_(i), x_(i)+x_(j), or x_(i)−x_(j), where s_(k) is a number of bits to right shift x_(i), i is less than k, and j is less than k.
9. The method of claim 1, further comprising: determining the sign-symmetric sequence as minimizing the relationship $\chi_{A_{a_i,b}} = \frac{1}{2^b}\sum\limits_{x=1}^{2^b}\left|A_{a_i,b}(x) + A_{a_i,b}(-x)\right|.$
10. A computer-readable medium comprising executable instructions to perform a method for calculating products, comprising instructions for: receiving an integer value x; determining a set of dyadic fractions a₁/2^(b), . . . , a_(m)/2^(b) approximating given constant factors; determining a sequence of intermediate values, x₁ . . . x_(t); and determining indices of output values l₁, . . . , l_(m) ≤ t, such that: x_(l₁) ≈ x*a₁/2^(b), . . . , x_(l_m) ≈ x*a_(m)/2^(b).
11. The computer readable medium of claim 10, further comprising instructions for: determining the sequence producing output values in accordance with a mean asymmetry metric, a mean error metric, a variance of error metric, and a magnitude of error metric.
12. The computer readable medium of claim 11, further comprising instructions for: setting x₁ equal to the input integer value; and determining x₂ . . . x_(t) in accordance with (a) one of x₁, . . . , x_(t−1), and (b) one of a plus operation, a minus operation, or a right-shift operation.
13. The computer readable medium of claim 10, further comprising instructions for: evaluating an efficiency of the sequence of output values based on a worst case result of the mean asymmetry metric, the mean error metric, the variance of error metric, and the magnitude of error metric.
14. The computer readable medium of claim 10, further comprising instructions for: determining the sequence of intermediate values having a least number of additions; and determining the sequence of intermediate values having a least number of shifts.
15. A digital signal transformation apparatus, comprising: a scaling stage that scales DCT coefficients in accordance with a row transform and a column transform, and that pre-allocates a predetermined number of precision bits to the input DCT coefficients; a transform stage that transforms the DCT coefficients utilizing sign-symmetric dyadic rational approximations of transform constants and outputs transformed DCT coefficients; and a right shift stage that shifts the transformed DCT coefficients to determine output transformed DCT coefficients.
16. The apparatus of claim 15, further comprising a DC bias stage that changes a DC biasing coefficient prior to the transform engine transforming the DCT coefficients to correct rounding errors.
17. The apparatus of claim 15, wherein the output transformed DCT coefficients are IDCT coefficients.
18. The apparatus of claim 15, wherein the sign-symmetric dyadic rational approximations of transform constants use intermediate values, x₁ . . . x_(t), determined by (a) setting x₁ equal to the input integer value, and (b) determining x₂ . . . x_(t) in accordance with one of x₁, . . . x_(t−1) and one of a plus operation, a minus operation, or a right-shift operation.
19. An apparatus for calculating products, comprising: means for receiving an integer value x; means for determining a set of dyadic fractions a₁/2^(b), . . . , a_(m)/2^(b) approximating given constant factors; means for determining a sequence of intermediate values, x₁ . . . x_(t); and means for determining indices of output values l₁, . . . , l_(m) ≤ t, such that: x_(l₁) ≈ x*a₁/2^(b), . . . , x_(l_m) ≈ x*a_(m)/2^(b).
20. The apparatus of claim 19, wherein the sequence of output values is determined in accordance with a mean asymmetry metric, a mean error metric, a variance of error metric, and a magnitude of error metric.
21. The apparatus of claim 19, wherein the means for determining the sequence sets x₁ equal to the input integer value, and determines x₂ . . . x_(t) in accordance with one of x₁, . . . x_(t−1) and one of a plus operation, a minus operation, or a right-shift operation.
22. The apparatus of claim 19, wherein an efficiency is determined in accordance with a worst case result of the mean asymmetry metric, the mean error metric, the variance of error metric, and the magnitude of error metric.
23. A method for calculating a product, comprising: receiving an integer value x; determining a set of dyadic fractions a₁/2^(b), . . . , a_(m)/2^(b) approximating given constant factors; determining a sequence of intermediate values, x₁ . . . x_(t); and determining indices of output values l₁, . . . , l_(m) ≤ t, such that: x_(l₁) ≈ x*a₁/2^(b), . . . , x_(l_m) ≈ x*a_(m)/2^(b).
24. The method of claim 23, further comprising: setting x₁ equal to the input integer value; determining x₂ . . . x_(t) in accordance with one of x₁, . . . x_(t−1) and one of a plus operation, a minus operation, or a right-shift operation; and determining the sequence producing output values in accordance with a mean asymmetry metric, a mean error metric, a variance of error metric, and a magnitude of error metric.